Architectural support for persistent memory systems
The long-stated vision of persistent memory is set to be realized with the release of
3D XPoint memory by Intel and Micron. Persistent memory, as the name suggests,
amalgamates the persistence (non-volatility) property of storage devices (like disks)
with byte-addressability and low latency of memory. These properties of persistent
memory coupled with its accessibility through the processor load/store interface enable
programmers to design in-memory persistent data structures.
An important challenge in designing persistent memory systems is to provide support
for maintaining crash consistency of these in-memory data structures. Crash consistency
is necessary to ensure the correct recovery of program state after a crash. Ordering
is a primitive that can be used to design crash consistent programs. It provides
guarantees on the order of updates to persistent memory. Atomicity can also be used
to design crash consistent programs via two primitives. First, as an atomic durability
primitive which guarantees that in the presence of system crashes updates are made
durable atomically, which means either all or none of the updates are made durable.
Second, in the form of ACID transactions that guarantee atomic visibility and atomic
durability.
Existing systems do not support ordering, let alone atomic durability or ACID.
In fact, these systems implement various performance enhancing optimizations that
deliberately reorder updates to memory. Moreover, software in these systems cannot
explicitly control the movement of data from volatile cache to persistent memory.
Therefore, any ordering requirement has to be enforced synchronously, which degrades
performance because program execution is stalled waiting for updates to reach persistent
memory. This thesis aims to provide the design principles and efficient implementations
for three crash consistency primitives: ordering, atomic durability and ACID
transactions.
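
The ordering problem described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the hardware studied in the thesis: `ToyPersistentMemory`, `write_back`, `persist_barrier` and `publish` are hypothetical names, the cache is modelled as a dictionary, and a synchronous barrier is modelled as draining every dirty entry before the program continues.

```python
class ToyPersistentMemory:
    """Toy model: a volatile write-back cache in front of persistent memory."""

    def __init__(self):
        self.cache = {}  # volatile: lost on a crash
        self.pmem = {}   # persistent: survives a crash

    def store(self, addr, value):
        # A store only updates the volatile cache.
        self.cache[addr] = value

    def write_back(self, addr):
        # Models the hardware writing back one dirty line of its own choosing,
        # independent of program order.
        if addr in self.cache:
            self.pmem[addr] = self.cache.pop(addr)

    def persist_barrier(self):
        # Synchronous ordering point: execution "stalls" until every
        # dirty line has reached persistent memory.
        self.pmem.update(self.cache)
        self.cache.clear()

    def crash(self):
        # Drop volatile state and return what recovery would observe.
        self.cache.clear()
        return dict(self.pmem)


def publish(mem, use_barrier):
    """Write a payload, then set a flag that marks the payload as valid."""
    mem.store("data", 42)
    if use_barrier:
        mem.persist_barrier()  # the payload must be durable before the flag
    mem.store("valid", True)


def is_consistent(state):
    # Invariant: a durable flag implies a durable payload.
    return (not state.get("valid", False)) or ("data" in state)
```

Without the barrier, the model allows the flag to be written back before the payload, so a crash can expose a flag that guards missing data. With the barrier the invariant holds, at the cost of stalling at every ordering point, which is the overhead the abstract refers to.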
A set of persistency models have been proposed recently which provide support for
the ordering primitive. This thesis extends the taxonomy of these models by adding
buffering, which allows the hardware to enforce ordering in the background, as a new
layer of classification. It then goes on to show how the existing implementation of a
buffered model degenerates to a performance inefficient non-buffered model because
of the presence of conflicts and proposes efficient solutions to eliminate or limit the
impact of these conflicts with minimal hardware modifications. This thesis also proposes
the first implementation of a buffered model for a server class processor with
multi-banked caches and multiple memory controllers.
Write ahead logging (WAL) is a commonly used approach to provide atomic durability.
This thesis argues that existing implementations of WAL in software are not only
inefficient, because of the fine-grained ordering dependencies, but also waste precious
execution cycles to implement a fundamentally data movement task. It then proposes
ATOM, a hardware log manager based on undo logging that performs the logging operation
out of the critical path. This thesis presents the design principles behind ATOM
and two techniques that optimize its performance. These techniques enable the memory
controller to enforce the fine-grained ordering required for logging and to even perform
logging in some cases. In doing so, ATOM significantly reduces processor stall cycles
and improves performance.
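
As a rough illustration of the fine-grained ordering dependency in software undo logging, the sketch below models generic write-ahead undo logging, not ATOM's hardware design. `UndoLogStore` and its methods are hypothetical names, and persistent memory is modelled as ordinary Python objects on the assumption that each statement becomes durable in program order.

```python
_MISSING = object()  # marks "key did not exist before the transaction"


class UndoLogStore:
    """Toy undo-logging store; both fields stand in for persistent state."""

    def __init__(self, initial):
        self.data = dict(initial)  # persistent data, updated in place
        self.log = []              # persistent undo log of (key, old value)

    def tx_write(self, key, value):
        # Ordering dependency: the log entry must be durable BEFORE the
        # in-place update, otherwise a crash leaves nothing to roll back to.
        self.log.append((key, self.data.get(key, _MISSING)))
        self.data[key] = value

    def commit(self):
        # All updates are durable, so the old values are no longer needed.
        self.log.clear()

    def recover(self):
        # After a crash, undo uncommitted updates, newest first.
        while self.log:
            key, old = self.log.pop()
            if old is _MISSING:
                self.data.pop(key, None)
            else:
                self.data[key] = old
```

Every `tx_write` implies one log-then-update ordering point. In software each of these becomes a synchronous persist, which is the per-update cost that the abstract argues should be moved off the critical path.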
The most commonly used abstraction employed to atomically update persistent
data is that of durable transactions with ACID (Atomicity, Consistency, Isolation
and Durability) semantics that make updates within a transaction both visible and
durable atomically. As a final contribution, this thesis tackles the problem of providing
efficient support for durable transactions in hardware by integrating hardware
support for atomic durability with hardware transactional memory (HTM). It proposes
DHTM (durable hardware transactional memory) in which durability is considered as
a first class design constraint. DHTM guarantees atomic durability via hardware redo-logging,
and integrates this logging support with a commercial HTM to provide atomic
visibility. Furthermore, DHTM leverages the same logging infrastructure to extend the
supported transaction size, from being L1-limited to the LLC, with minor changes to
the coherence protocol.
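
For contrast with undo logging, redo logging in general can be sketched as follows. This is a minimal software model under assumed names (`RedoLogStore`, `tx_write`, `apply_log`), not DHTM's hardware mechanism, and it again assumes that each statement becomes durable in program order.

```python
class RedoLogStore:
    """Toy redo-logging store; all three fields stand in for persistent state."""

    def __init__(self, initial):
        self.data = dict(initial)  # persistent data
        self.log = []              # persistent redo log of (key, new value)
        self.committed = False     # persistent commit marker

    def tx_write(self, key, value):
        # New values go to the log only; the data stays untouched until commit.
        self.log.append((key, value))

    def tx_read(self, key):
        # A read inside the transaction must see its own most recent write.
        for k, v in reversed(self.log):
            if k == key:
                return v
        return self.data.get(key)

    def commit(self):
        # Setting the marker is the single atomic durability point.
        self.committed = True

    def apply_log(self):
        # Copy the logged values in place, then retire the log.
        for key, value in self.log:
            self.data[key] = value
        self.log.clear()
        self.committed = False

    def recover(self):
        # Replay a committed log; discard an uncommitted one.
        if self.committed:
            self.apply_log()
        else:
            self.log.clear()
```

In this model the only ordering requirement is that all log entries are durable before the commit marker, and the in-place updates can be applied lazily afterwards.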